Power, Energy and Speed of Embedded and Server Multi-Cores applied to Distributed Simulation of Spiking Neural Networks: ARM in NVIDIA Tegra vs Intel Xeon quad-cores

Authors

  • Pier Stanislao Paolucci
  • Roberto Ammendola
  • Andrea Biagioni
  • Ottorino Frezza
  • Francesca Lo Cicero
  • Alessandro Lonardo
  • Michele Martinelli
  • Elena Pastorelli
  • Francesco Simula
  • Piero Vicini
Abstract

This short note compares the instantaneous power, total energy consumption, execution time and energy cost per synaptic event of a spiking neural network simulator (DPSNN-STDP), distributed over MPI processes, when executed either on an embedded platform (based on a dual-socket quad-core ARM platform) or on a server platform (an Intel-based quad-core dual-socket platform). We also compare the measures with those reported by leading custom and semi-custom designs: TrueNorth and SPiNNaker. In summary, we observed that: (1) we spent 2.2 micro-Joule per simulated synaptic event on the “embedded platform”, approx. 4.4 times lower than what was spent by the “server platform”; (2) the instantaneous power consumption of the “embedded platform” was 14.4 times lower than that of the “server” one; (3) the server platform is a factor 3.3 faster. The “embedded platform” is made of NVIDIA Jetson TK1 boards, interconnected by Ethernet, each mounting a Tegra K1 chip including a quad-core ARM [email protected]. The “server platform” is based on nodes which are dual-socket, quad-core Intel Xeon CPUs ([email protected]). The measures were obtained with the DPSNN-STDP simulator (Distributed Simulation of Polychronous Spiking Neural Network with synaptic Spike-Timing Dependent Plasticity) developed by INFN, which has already proved its efficient scalability and execution speed-up on hundreds of similar “server” cores and MPI processes, applied to neural nets composed of several billions of synapses.
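The three headline figures are linked by energy = power × time. The following is a minimal sketch of that consistency check, not code from the paper. It assumes both platforms ran the same workload and that the power figures are averages over the run. All names are hypothetical, and the server-side energy per event is derived here from the quoted ratio rather than quoted from the abstract.

```python
# Figures quoted in the abstract.
EMBEDDED_ENERGY_PER_EVENT_UJ = 2.2   # micro-Joule per synaptic event
REPORTED_ENERGY_RATIO = 4.4          # server energy / embedded energy (approx.)
POWER_RATIO = 14.4                   # server power / embedded power
SPEED_RATIO = 3.3                    # embedded run time / server run time


def energy_ratio(power_ratio: float, speed_ratio: float) -> float:
    """Server-to-embedded energy ratio for the same workload.

    Energy is power times time. The server draws `power_ratio` times
    more power but runs `speed_ratio` times shorter, so the energy
    ratio is the quotient of the two.
    """
    return power_ratio / speed_ratio


def energy_per_event_uj(total_energy_j: float, synaptic_events: int) -> float:
    """Energy cost per synaptic event in micro-Joule (hypothetical helper)."""
    return total_energy_j / synaptic_events * 1e6


derived_ratio = energy_ratio(POWER_RATIO, SPEED_RATIO)

# Assumption: derived from the quoted ratio, not a figure stated in the abstract.
server_energy_per_event_uj = EMBEDDED_ENERGY_PER_EVENT_UJ * REPORTED_ENERGY_RATIO

print(f"derived energy ratio: {derived_ratio:.2f}")
print(f"implied server cost: {server_energy_per_event_uj:.2f} uJ per event")
```

The derived ratio rounds to the reported factor of about 4.4, so the power, speed and energy figures in the abstract are mutually consistent under these assumptions.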


Similar resources

Scaling to 1024 software processes and hardware cores of the distributed simulation of a spiking neural network including up to 20G synapses

This short report describes the scaling, up to 1024 software processes and hardware cores, of a distributed simulator of plastic spiking neural networks (DPSNN). A previous report demonstrated good scalability of the simulator up to 128 processes. Herein we extend the speed-up measurements and strong and weak scaling analysis of the simulator to the range between 1 and 1024 software processes a...


Gaussian and exponential lateral connectivity on distributed spiking neural network simulation

We measured the impact of long-range exponentially decaying intra-areal lateral connectivity on the scaling and memory occupation of a distributed spiking neural network simulator compared to that of short-range Gaussian decays. While previous studies adopted short-range connectivity, recent experimental neuroscience studies are pointing out the role of longer-range intra-areal connectivity wi...


An evaluation of contemporary heterogeneous computing platforms for data intensive applications

The end of Dennard scaling is forcing innovation in computing architectures; the old regime, where exponential benefits accrued by using a newer process technology, no longer holds. One path of innovation is to exploit heterogeneous platforms, and match applications to the platforms on which they execute efficiently. Heterogeneous processing in the server domain is a big hardware/software desig...


JetsonLeap: A Framework to Measure Energy-Aware Code Optimizations in Embedded and Heterogeneous Systems

Energy-aware techniques are becoming a staple feature among compiler analyses and optimizations. However, the programming languages community still does not have access to cheap and precise technology to measure the power dissipated by a given program. This paper describes a solution to this problem. To this end, we introduce JetsonLeap, a framework that enables the design and test of energy-aw...


Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specia...



Journal:
  • CoRR

Volume: abs/1505.03015


Publication date: 2015